Extracting Terminologically Relevant Collocations in the Translation of Chinese Monograph
Authors
Abstract
This paper proposes a methodology for extracting terminologically relevant collocations for translation purposes. The basic idea is a hybrid method that combines statistical measures with linguistic rules. The extraction system used in our work operates in three steps: (1) tokenization and POS tagging of the corpus; (2) extraction of multi-word units using a statistical measure; (3) linguistic filtering based on syntactic patterns and a stop-word list. The hybrid method with linguistic filters proved suitable for selecting terminological collocations: it considerably improved the precision of the extraction, which was much higher than that of the purely statistical method. In our test, the hybrid method combining the log-likelihood ratio with linguistic rules performed best. We believe that terminological collocations and phrases extracted in this way could be used effectively either to supplement existing terminological collections or alongside traditional reference works.
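The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the toy English corpus, the Penn-style tags, the `PATTERNS` and `STOP_WORDS` sets, and the function names `llr` and `extract` are all assumptions made for illustration. The only element taken from the abstract is the overall shape: pre-tagged input, a log-likelihood ratio score (here in Dunning's 2x2 contingency-table form) for adjacent word pairs, and a filter on syntactic patterns and stop words.

```python
import math
from collections import Counter

# Step 1 (assumed already done): a tokenized, POS-tagged corpus.
# Hypothetical toy data; the paper works on Chinese text with its own tagset.
TAGGED = [
    ("machine", "NN"), ("translation", "NN"), ("of", "IN"), ("the", "DT"),
    ("text", "NN"), ("is", "VBZ"), ("hard", "JJ"), (".", "."),
    ("machine", "NN"), ("translation", "NN"), ("needs", "VBZ"),
    ("parallel", "JJ"), ("corpus", "NN"), (".", "."),
]

# Step 3 resources (illustrative assumptions, not the paper's actual lists).
STOP_WORDS = {"of", "the", "is", "a"}
PATTERNS = {("NN", "NN"), ("JJ", "NN")}


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) of a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def extract(tagged, min_score=0.0):
    """Score adjacent pairs (step 2), then apply the linguistic filter (step 3)."""
    bigrams = list(zip(tagged, tagged[1:]))
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    first_counts = Counter(a for a, _ in bigrams)
    second_counts = Counter(b for _, b in bigrams)

    results = []
    for (a, b), c12 in pair_counts.items():
        c1, c2 = first_counts[a], second_counts[b]
        score = llr(c12, c1 - c12, c2 - c12, n - c1 - c2 + c12)
        (w1, t1), (w2, t2) = a, b
        if score <= min_score:
            continue
        if (t1, t2) not in PATTERNS:
            continue  # syntactic-pattern filter
        if w1 in STOP_WORDS or w2 in STOP_WORDS:
            continue  # stop-word filter
        results.append((w1, w2, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for w1, w2, score in extract(TAGGED):
        print(f"{w1} {w2}\t{score:.2f}")
```

On this toy input, only the noun-noun and adjective-noun pairs survive the filter, while pairs containing function words are discarded regardless of their statistical score. That is the effect the abstract attributes to the linguistic filtering step.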
Similar resources
Collocational Translation Memory Extraction Based on Statistical and Linguistic Information
In this paper, we propose a new method for extracting bilingual collocations from a parallel corpus to provide phrasal translation memories. The method integrates statistical and linguistic information to achieve effective extraction of bilingual collocations. The linguistic information includes parts of speech, chunks, and clauses. The method involves first obtaining an extended list of Englis...
Extracting Verb-Noun Collocations from Text
In this paper, we describe a new method for extracting monolingual collocations. The method, which is based on statistical methods, extracts VN collocations from large textual corpora. Being able to extract a large number of collocations is critical to machine translation and many other applications. The method has an element of snowballing in it. Initially, one identifies a pattern that will prod...
The Effect of Mobile-Assisted Teaching of Collocations on Reading Ability of Iranian EFL Learners
This study aimed to discover the effect of mobile-assisted teaching of collocations on Iranian EFL learners’ reading achievement. For this purpose, a PET test was given to 85 intermediate EFL learners as the proficiency test. After homogenization, 30 female and male students within the age range of 16 to 30 years old from an institute in Alborz Province were selected as the participants in the ...
Extracting collocations and their translations from parallel corpora
Identifying collocations in a text (e.g., break record) and correctly translating them (battre record vs. *casser record) represent key issues in machine translation, notably because of their prevalence in language and their syntactic flexibility. This article describes a method for discovering translation equivalents for collocations from parallel corpora, aimed at increasing the lexical cover...
Issues in defining/extracting collocations in Japanese and Korean: Empirical implications for building a collocation database
Collocations in Japanese and Korean have been studied extensively based on statistical tools. The criteria for collocations in these languages, however, have not been fully established in the literature, and it is not obvious whether all statistically significant combinations of words could be regarded as collocations. In this article, we point out empirical problems in extracting collocations ...